In [1]:
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
import matplotlib.pyplot as plt
import torch
from torch import nn
from torch import optim
import torch.nn.functional as F
from torchvision import datasets, transforms
import helper
import fc_model
In [2]:
# Define a transform to normalize the data
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,))])

# Download and load the training data
trainset = datasets.FashionMNIST('F_MNIST_data/',
                                 download=True,
                                 train=True,
                                 transform=transform)
trainloader = torch.utils.data.DataLoader(dataset=trainset,
                                          batch_size=64,
                                          shuffle=True)

# Download and load the test data
testset = datasets.FashionMNIST('F_MNIST_data/',
                                download=True,
                                train=False,
                                transform=transform)
testloader = torch.utils.data.DataLoader(dataset=testset,
                                         batch_size=64,
                                         shuffle=True)
Here we can see one of the images.
In [3]:
image, label = next(iter(trainloader))
helper.imshow(image[0,:]);
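As a quick sanity check on the batch we just grabbed: with ToTensor, Fashion-MNIST images come out as 1x28x28 grayscale tensors, so a full batch from this loader has shape 64 x 1 x 28 x 28.

# Sketch: check the shapes of the batch loaded above
print(image.shape)   # torch.Size([64, 1, 28, 28]) -- 64 grayscale 28x28 images
print(label.shape)   # torch.Size([64]) -- one class index per image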
To make things more concise here, I moved the model architecture and training code from the last part to a file called fc_model. Importing this, we can easily create a fully-connected network with fc_model.Network, and train the network using fc_model.train. I'll use this model (once it's trained) to demonstrate how we can save and load models.
In [4]:
# Create the network, define the criterion and optimizer
model = fc_model.Network(784, 10, [512, 256, 128])
criterion = nn.NLLLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
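A quick note on this pairing: nn.NLLLoss expects log-probabilities as input, which implies the network's forward pass ends in log_softmax (that's what pairing it with NLLLoss suggests about fc_model.Network). A small standalone sketch of why this is equivalent to the usual cross-entropy setup, using plain tensors rather than fc_model:

# Sketch: NLLLoss on log-probabilities equals CrossEntropyLoss on raw scores
logits = torch.randn(4, 10)            # hypothetical batch of raw scores
targets = torch.tensor([1, 0, 4, 9])   # hypothetical class labels
log_ps = F.log_softmax(logits, dim=1)
print(nn.NLLLoss()(log_ps, targets))
print(nn.CrossEntropyLoss()(logits, targets))  # same value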
In [5]:
fc_model.train(model,
               trainloader,
               testloader,
               criterion,
               optimizer,
               epochs=2)
As you can imagine, it's impractical to train a network every time you need to use it. Instead, we can save trained networks, then load them later to train further or to use them for predictions.
The parameters for PyTorch networks are stored in a model's state_dict. We can see the state dict contains the weight and bias matrices for each of our layers.
In [6]:
print("Our model: \n\n", model, '\n')
print("The state dict keys: \n\n", model.state_dict().keys())
The simplest approach is to save the state dict with torch.save. For example, we can save it to a file 'checkpoint.pth'.
In [7]:
torch.save(model.state_dict(), './models/checkpoint.pth')
Then we can load the state dict with torch.load.
In [8]:
state_dict = torch.load('./models/checkpoint.pth')
print(state_dict.keys())
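One practical caveat: if a checkpoint was saved on a GPU and you load it on a CPU-only machine, pass map_location to torch.load. A sketch:

# Sketch: remap tensors to the CPU when loading a GPU-saved checkpoint
state_dict = torch.load('./models/checkpoint.pth', map_location='cpu')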
And to load the state dict into the network, you do model.load_state_dict(state_dict).
In [9]:
model.load_state_dict(state_dict)
Seems pretty straightforward, but as usual it's a bit more complicated. Loading the state dict works only if the model architecture is exactly the same as the checkpoint architecture. If I create a model with a different architecture, this fails.
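A minimal sketch of that failure mode (the hidden sizes here are arbitrary): loading our 512/256/128 checkpoint into a 400/200/100 network raises a RuntimeError about mismatched tensor sizes.

# Sketch: loading a state dict into a mismatched architecture fails
try:
    wrong_model = fc_model.Network(784, 10, [400, 200, 100])
    wrong_model.load_state_dict(state_dict)
except RuntimeError as e:
    print("Load failed:", e)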
This means we need to rebuild the model exactly as it was when trained. Information about the model architecture needs to be saved in the checkpoint, along with the state dict.
In [10]:
checkpoint = {'input_size': 784,
              'output_size': 10,
              'hidden_layers': [each.out_features for each in model.hidden_layers],
              'state_dict': model.state_dict()}

torch.save(checkpoint, './models/checkpoint.pth')
Now the checkpoint has all the necessary information to rebuild the trained model. You can easily make that a function if you want. Similarly, we can write a function to load checkpoints.
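For instance, a sketch of such a saving helper (the name save_checkpoint and the size arguments are mine, not part of fc_model):

# Sketch: bundle architecture info and weights into one checkpoint file
def save_checkpoint(model, filepath, input_size=784, output_size=10):
    checkpoint = {'input_size': input_size,
                  'output_size': output_size,
                  'hidden_layers': [each.out_features for each in model.hidden_layers],
                  'state_dict': model.state_dict()}
    torch.save(checkpoint, filepath)

save_checkpoint(model, './models/checkpoint.pth')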
In [11]:
def load_checkpoint(filepath):
    checkpoint = torch.load(filepath)
    model = fc_model.Network(checkpoint['input_size'],
                             checkpoint['output_size'],
                             checkpoint['hidden_layers'])
    model.load_state_dict(checkpoint['state_dict'])
    return model
In [12]:
model = load_checkpoint('./models/checkpoint.pth')
print(model)
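With the model rebuilt, we can use it for inference right away. A quick sketch, under two assumptions about fc_model from the earlier part: the forward pass expects flattened 784-pixel inputs and returns log-probabilities, and the network uses dropout, hence the model.eval() call.

# Sketch: predict with the reloaded model (assumes flattened inputs + log-softmax output)
model.eval()                               # disable dropout for inference
images, labels = next(iter(testloader))
with torch.no_grad():
    log_ps = model(images.view(images.shape[0], -1))
ps = torch.exp(log_ps)                     # log-probabilities -> probabilities
print(ps.argmax(dim=1)[:10])               # predicted classes for the first 10 images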
In [13]:
# Inspect the learnable parameters of the reloaded model
for name, param in model.named_parameters():
    if param.requires_grad:
        print(name, ':')
        print(param.data)
In [14]:
# Grab the first (name, parameter) pair from the generator
name, params = next(model.named_parameters())
In [15]:
name
Out[15]:
In [16]:
params
Out[16]: